XHAMI - extended HDFS and MapReduce interface for Big Data image processing applications in cloud computing environments

Authors

  • Raghavendra Kune
  • Pramod Konugurthi
  • Arun Agarwal
  • C. Raghavendra Rao
  • Rajkumar Buyya
Abstract

The Hadoop Distributed File System (HDFS) and the MapReduce model have become popular technologies for large-scale data organization and analysis. The existing model of data organization and processing in Hadoop using HDFS and MapReduce is ideally tailored for search and data-parallel applications, in which a piece of data has no dependency on its neighbouring/adjacent data. However, many scientific applications, such as image mining, data mining, knowledge data mining, and satellite image processing, depend on adjacent data for processing and analysis. In this paper, we identify the requirements of overlapped data organization and propose a two-phase extension to HDFS and the MapReduce programming model, called XHAMI, to address them. The extended interfaces are presented as APIs and implemented in the context of the Image Processing (IP) application domain. We demonstrate the effectiveness of XHAMI through case studies of image processing functions along with the results. Although XHAMI incurs a small overhead in data storage and input/output operations, it greatly enhances the system performance and simplifies the application development process. Our proposed system, XHAMI, works without any changes for the existing MapReduce models, and can be utilised by many applications where there is a requirement of overlapped data.
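
The idea of overlapped data organization can be illustrated with a minimal sketch. It is not XHAMI's API: the function names, the row-wise blocking, and the 3-row mean filter are all assumptions chosen for illustration. Each block is read together with a few extra rows from its neighbours, so a neighbourhood operation can run on every block independently and still match the result computed on the whole image.

```python
# Illustrative sketch only; names and blocking scheme are hypothetical, not XHAMI's.

def split_with_overlap(n_rows, block_rows, overlap):
    """Return (core_start, core_end, read_start, read_end) row ranges, end-exclusive.

    The core range is what the block is responsible for; the read range
    adds up to `overlap` rows from each neighbouring block.
    """
    blocks = []
    for core_start in range(0, n_rows, block_rows):
        core_end = min(core_start + block_rows, n_rows)
        read_start = max(core_start - overlap, 0)
        read_end = min(core_end + overlap, n_rows)
        blocks.append((core_start, core_end, read_start, read_end))
    return blocks


def vertical_mean3(rows, lo, hi):
    """3-row vertical mean for rows lo..hi-1 of `rows`, clamping at the first and last row."""
    last = len(rows) - 1
    out = []
    for r in range(lo, hi):
        above, below = rows[max(r - 1, 0)], rows[min(r + 1, last)]
        out.append([(a + b + c) / 3 for a, b, c in zip(above, rows[r], below)])
    return out


def process_blockwise(image, block_rows, overlap=1):
    """Filter each overlapped block on its own and keep only its core rows."""
    result = []
    for core_start, core_end, read_start, read_end in split_with_overlap(
        len(image), block_rows, overlap
    ):
        block = image[read_start:read_end]  # core rows plus overlap rows
        result.extend(
            vertical_mean3(block, core_start - read_start, core_end - read_start)
        )
    return result


image = [[float(r * 10 + c) for c in range(4)] for r in range(7)]
blockwise = process_blockwise(image, block_rows=3)
whole = vertical_mean3(image, 0, len(image))
```

In this sketch the overlap rows are stored twice, which mirrors the storage overhead the abstract mentions, while each block can be processed without fetching data from its neighbours.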

Similar resources

Cloud Computing Technology Algorithms Capabilities in Managing and Processing Big Data in Business Organizations: MapReduce, Hadoop, Parallel Programming

The objective of this study is to verify the importance of the capabilities of cloud computing services in managing and analyzing big data in business organizations because the rapid development in the use of information technology in general and network technology in particular, has led to the trend of many organizations to make their applications available for use via electronic platforms hos...

Big Data Processing with Hadoop-MapReduce in Cloud Systems

Received Oct 10th, 2012. Accepted Oct 31st, 2012. Today, we're surrounded by data like oxygen. The exponential growth of data first presented challenges to cutting-edge businesses such as Google, Yahoo, Amazon, Microsoft, Facebook, Twitter etc. Data volumes to be processed by cloud applications are growing much faster than computing power. This growth demands new strategies for processing and...

Implementation of image processing system using handover technique with map reduce based on big data in the cloud environment

Cloud computing is one of the emerging techniques to process big data. Cloud computing is also known as service on demand. A large set or large volume of data is known as big data. Processing big data (MRI images and DICOM images) normally takes more time. Hard tasks such as handling big data can be solved by using the concepts of Hadoop. Enhancing the Hadoop concept will help the user t...

Study on Hadoop and MapReduce Framework

Hadoop, a Java software framework, supports data-intensive distributed applications. Hadoop is developed under an open source license. It enables applications to work with thousands of nodes and petabytes of data. Hadoop has formed a framework for Big Data analysis. Its MapReduce technique made it more useful for processing huge amounts of data. Hadoop is incorporated with cloud computi...

Morpho: A decoupled MapReduce framework for elastic cloud computing

MapReduce as a service enjoys wide adoption in commercial clouds today [3,23]. But most cloud providers just deploy native Hadoop [24] systems on their cloud platforms to provide MapReduce services without any adaptation to these virtualized environments [6,25]. In cloud environments, the basic executing units of data processing are virtual machines. Each user's virtual cluster needs to deploy HD...

Journal title:
  • Softw., Pract. Exper.

Volume 47  Issue 

Pages  -

Publication date 2017